In this workshop, the aim is to cover some basics of using variables and vectors in R, as well as a start on using strings. We will be covering:
Note to self: move packages to next workshop, just cover how to use functions here. Also move lists back and cover basic vector indexing (which can be more in depth later with data frames)
We will be working in pairs:
What to do when getting stuck:
To get feedback: hand in your R markdown exercise file in the assignment on the Teams channel for the R 1 workshop.
A vector is a set of information contained together in a specific order.
To make a vector, you combine values using the c function (more on functions later); this is also known as concatenation. To call the c function we use brackets () containing the values we want, separated by commas.
The first way of making a vector is to add the arguments (numbers) you want.
## [1] 1 6 19 4 9
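A call along these lines would produce the output above. The variable name `myNumbers` is a placeholder chosen for illustration, not a name taken from the workshop.

```r
# Combine five numbers into one vector with c()
# (myNumbers is a hypothetical name)
myNumbers <- c(1, 6, 19, 4, 9)

# Typing the name of the vector prints its contents
myNumbers
```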
We can also combine predefined variables and vectors together to create a new vector.
## [1] 1 6 19 4 9 22 7 30
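As a sketch of how the output above might come about, c() accepts single variables and whole vectors side by side. All names here are hypothetical, and the split of the values between them is an assumption chosen to reproduce the printed result.

```r
# A vector, a single variable, and a second vector (all names hypothetical)
firstNumbers <- c(1, 6, 19, 4, 9)
extraNumber <- 22
moreNumbers <- c(7, 30)

# c() flattens everything into one new vector, keeping the order given
combined <- c(firstNumbers, extraNumber, moreNumbers)
combined
```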
Another way of making a vector is using the colon (:), which can be done without the c function. We can tell R to select a sequence of integers from x to y, or 5 through to 10 in our example.
## [1] 5 6 7 8 9 10
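The colon form for the sequence above looks like this; `fiveToTen` is a placeholder name.

```r
# start:end gives every integer from start to end, inclusive
# (fiveToTen is a hypothetical name)
fiveToTen <- 5:10
fiveToTen
```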
We can also do some basic calculations on vectors. These occur elementwise (one element at a time).
## [1] 1.0 1.2 1.4 1.6 1.8 2.0
As you can see this divides all elements in the vector by 5.
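A minimal sketch of the division described above, assuming the vector is the 5 to 10 sequence from the previous example (the names are hypothetical). Other arithmetic operators behave the same way.

```r
# The sequence from the colon example (hypothetical name)
fiveToTen <- 5:10

# Division is applied to each element in turn
scaled <- fiveToTen / 5
scaled

# Multiplication and addition are also elementwise
doubled <- fiveToTen * 2
plusOne <- fiveToTen + 1
```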
A function is code organised together to perform a specific task. The function will take in an input, perform a task, then return an output. Functions are the backbone of R, which comes with a wide array of them built in.
The function(input) format is the fundamental way to call and use a function in R. function is the name of the function we are using, and input is the argument or data we are passing to the function.
For example:
# running times (mins)
runTimes <- c(31, 50, 15, 19, 23, 34, 9)
# mean running time
meanRun <- mean(runTimes)
meanRun
## [1] 25.85714
# tidy up result
meanRun <- round(meanRun, digits = 2)
# print nice result
paste0("Your mean running time is: ", meanRun, " minutes")
## [1] "Your mean running time is: 25.86 minutes"
Here we are using the functions c, round, mean, and paste0. We will be using these in our exercises today.
We are on a walking exercise plan, where we increase our step count by a thousand each day, starting at 1000 steps and ending on 12000.
seq function that increases steps from 1000 to 12000 by increments of 1000
Indexing is a technical term for accessing elements of a vector. Think of it like selecting books from a bookshelf. The vector is your bookshelf, and you are the index picking which book, or books, you want to read.
Designed by macrovector / Freepik
To index in R, you use square brackets [] after the name of the vector you want to index from. You then put the positions of the elements you want inside the square brackets.
Some examples:
## [1] 9
Indexing elements 1 to 4
## [1] 4 26 11 15
Dropping elements 5 to 7
## [1] 4 26 11 15 1
Indexing 1, 5, and 8
## [1] 4 18 1
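The calls behind these examples might look like the sketch below. The vector `someNumbers` is hypothetical: its name is a placeholder, and its values are reconstructed from the outputs shown in this section, so treat it as an assumption rather than the workshop's own data.

```r
# Hypothetical vector, reconstructed to match the outputs above
someNumbers <- c(4, 26, 11, 15, 18, 9, 3, 1)

# A single element: position 6 holds the value 9
someNumbers[6]

# Indexing elements 1 to 4
someNumbers[1:4]

# Dropping elements 5 to 7 (a minus sign removes positions)
someNumbers[-(5:7)]

# Indexing 1, 5, and 8 (wrap non-adjacent positions in c())
someNumbers[c(1, 5, 8)]
```

Note the brackets around `5:7` in the drop example: `-5:7` would mean the sequence from -5 to 7, which is not a valid index.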
If you try to index outside of the vector's range, you get an NA. One way of checking the range is the length function. Our vector has 8 elements, but we tried to call a 9th.
## [1] NA
## [1] 8
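A short sketch of the out-of-range case, again assuming a hypothetical eight-element vector named `someNumbers` with values reconstructed from this section's outputs.

```r
# Hypothetical eight-element vector
someNumbers <- c(4, 26, 11, 15, 18, 9, 3, 1)

# There is no 9th element, so R returns NA instead of an error
ninth <- someNumbers[9]
ninth

# length() reports how many elements the vector holds
length(someNumbers)
```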
Using indexing you can change the value of an item, or items, in a vector.
## [1] 4 26 11 15 18 9 3 50
## [1] 19 20 21 15 18 9 3 50
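Assignments like the following would give the two outputs above. The starting vector and its name `someNumbers` are assumptions reconstructed from the printed results.

```r
# Hypothetical starting vector
someNumbers <- c(4, 26, 11, 15, 18, 9, 3, 1)

# Change a single item: replace the 8th element with 50
someNumbers[8] <- 50
someNumbers

# Change several items at once: replace elements 1 to 3
someNumbers[1:3] <- c(19, 20, 21)
someNumbers
```

The replacement on the right-hand side should have the same number of values as the positions being replaced, or be a single value to repeat.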
You decided to track your total monthly expenditures for the year to find out more about your monthly spending, such as spending per quarter, the biggest spending month, and the lowest spending month.
Using the which.max() and which.min() functions, find out which months had the highest and lowest spending.
So far we have only been working with numbers and integers. Strings are text-based data.
R calls strings characters. You can find out what type of data your variable/vector holds:
## [1] "character"
## [1] "numeric"
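One function that returns results like the two above is class(). The sketch below assumes that is the function intended here; the variable names and values are hypothetical.

```r
# A string (character) variable, with a hypothetical name and value
favouriteFood <- "pizza"
class(favouriteFood)

# A numeric vector, also hypothetical
someNumbers <- c(4, 26, 11)
class(someNumbers)
```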
# vector of places people are from
places <- c(rep("Hampshire", 2), rep("London", 5), rep("Kent", 1), rep("Surrey", 3))
# counting how many people from each place
table(places)
## places
## Hampshire      Kent    London    Surrey
##         2         1         5         3
We can also use random sampling (sample) and random number generation to make vectors, which we can then use in calculations.
## [1] 28.388635 19.417827 17.947200 10.803364 21.515738 29.240419 6.985630
## [8] 15.487846 2.067749 15.486720
## [1] 36 2 4 9 8 50 24 16 49 19 27 41 31 44 48 10 42 32 35 7 14 12 46 18 39
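A sketch of how vectors like the two above could be generated. The functions runif() and sample() exist in base R, but the arguments, seed, and names are assumptions; the original calls are not shown, so your numbers will differ from the output above.

```r
# Fix the random seed so the results repeat between runs (seed is arbitrary)
set.seed(42)

# Ten random decimal numbers between 0 and 30 (range is an assumption)
randomDecimals <- runif(10, min = 0, max = 30)

# Twenty-five different whole numbers drawn from 1 to 50, without replacement
randomDraw <- sample(1:50, size = 25)

# The vectors can then be used in calculations
mean(randomDecimals)
max(randomDraw)
```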
Get them to look up rep and seq functions (homework?)
## [1] 1 2 1 2 1 2 1 2 1 2
## [1] 1 3 5 7 9 11 13 15 17 19
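The two outputs above are consistent with calls to rep and seq such as these. The exact arguments and the names are an assumption, since the original code is not shown.

```r
# Repeat the sequence 1, 2 five times (hypothetical name)
repeated <- rep(1:2, times = 5)
repeated

# Odd numbers from 1 to 19: start at 1, stop at 19, step by 2
oddNumbers <- seq(1, 19, by = 2)
oddNumbers
```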
If you attended the first R workshop, you might remember we calculated a student's weighted average grade. Convert this to incorporate 10 students instead of just the one.
exam1 <- c(52, 62, 55, 82, 48, 65, 68, 62, 65, 65)
coursework1 <- c(72, 72, 85, 52, 78, 62, 65, 52, 55, 68)
exam2 <- c(62, 72, 58, 52, 68, 75, 62, 65, 62, 88)
coursework2 <- c(72, 62, 65, 62, 78, 45, 78, 65, 55, 75)
cw_weight <- 0.4
ex_weight <- 0.6
course1 <- (exam1 * ex_weight) + (coursework1 * cw_weight)
course2 <- (exam2 * ex_weight) + (coursework2 * cw_weight)
overall_grade <- (course1 + course2)/2
overall_grade
## [1] 63.0 67.0 63.9 63.0 66.0 63.4 67.6 61.5 60.1 74.5